Error Mining on Syntactic Parser Output
Abstract
We introduce an error mining technique for automatically detecting errors in the resources used by parsing systems. We applied this technique to the parsing results produced on several million words by two distinct parsing systems that share the same syntactic lexicon and pre-parsing processing chain. This lets us identify sources of incorrectness and incompleteness in those resources. In particular, by comparing the two systems' results, we can separate problems caused by the shared resources from those caused by the grammars.
Keywords: syntactic parsing, syntactic lexicon, error mining.
Similar resources
Feature Engineering in Persian Dependency Parser
A dependency parser is one of the most important fundamental tools in natural language processing: it extracts the structure of sentences and determines the relations between words based on dependency grammar. Dependency parsing is well suited to free-word-order languages such as Persian. In this paper, a data-driven dependency parser has been developed with the help of phrase-structure parser fo...
Challenges in Mapping of Syntactic Representations for Framework-Independent Parser Evaluation
We explore some of the issues and challenges created by the incompatibility of diverse representation schemes for syntactic parsing. In particular, we examine the problem of output format conversion for evaluation of parsers that use different formalisms. We discuss recent related efforts, and present an evaluation of different parsers that use representations that vary not only in formalisms, ...
Semantic Role Labeling of Persian Sentences Using a Memory-Based Learning Approach
Extracting semantic roles is one of the major steps in representing text meaning. It refers to finding the semantic relations between a predicate and the syntactic constituents in a sentence. In this paper we present a semantic role labeling system for Persian, using a memory-based learning model and standard features. Our proposed system implements a two-phase architecture to first identify...
REALEC learner treebank: annotation principles and evaluation of automatic parsing
The paper presents a Universal Dependencies (UD) annotation scheme for a learner English corpus. The REALEC dataset consists of essays written in English by Russian-speaking university students in the course of general English. The original corpus is manually annotated for learners’ errors and gives information on the error span, error type, and the possible correction of the mistake provided b...
Robust Probabilistic Predictive Syntactic Processing
“Robust Probabilistic Predictive Syntactic Processing” by Brian Edward Roark, Ph.D., Brown University, May 2001. This thesis presents a broad-coverage probabilistic top-down parser, and its application to the problem of language modeling for speech recognition. The parser builds fully connected derivations incrementally, in a single pass from left-to-right across the string. We argue that t...
Journal title:
- TAL
Volume 49
Publication date: 2008